Genome-wide association study for cashmere traits in Inner Mongolia cashmere goat population reveals new candidate genes and haplotypes

Background The cashmere goat industry is one of the main pillars of animal husbandry in Inner Mongolia Autonomous Region, and plays an irreplaceable role in local economic development. With the change in feeding methods and environment, the cashmere produced by Inner Mongolia cashmere goats shows a tendency of coarser, and the cashmere yield can not meet the consumption demand of people. However, the genetic basis behind these changes is not fully understood. We measured cashmere traits, including cashmere yield (CY), cashmere diameter (CD), cashmere thickness (CT), and fleece length (FL) traits for four consecutive years, and utilized Genome-wide association study of four cashmere traits in Inner Mongolia cashmere goats was carried out using new genomics tools to infer genomic regions and functional loci associated with cashmere traits and to construct haplotypes that significantly affect cashmere traits. Results We estimated the genetic parameters of cashmere traits in Inner Mongolia cashmere goats. The heritability of cashmere yield, cashmere diameter, and fleece length traits of Inner Mongolia cashmere goats were 0.229, 0.359, and 0.250, which belonged to the medium heritability traits (0.2 ~ 0.4). The cashmere thickness trait has a low heritability of 0.053. We detected 151 genome-wide significantly associated SNPs with four cashmere traits on different chromosomes, which were very close to the chromosomes of 392 genes (located within the gene or within ± 500 kb). Notch3, BMPR1B, and CCNA2 have direct functional associations with fibroblasts and follicle stem cells, which play important roles in hair follicle growth and development. Based on GO functional annotation and KEGG enrichment analysis, potential candidate genes were associated with pathways of hair follicle genesis and development (Notch, P13K-Akt, TGF-beta, Cell cycle, Wnt, MAPK). We calculated the effective allele number of the Inner Mongolia cashmere goat population to be 1.109–1.998, the dominant genotypes of most SNPs were wild-type, the polymorphic information content of 57 SNPs were low polymorphism (0 < PIC < 0.25), and the polymorphic information content of 79 SNPs were moderate polymorphism (0.25 < PIC < 0.50). We analyzed the association of SNPs with phenotypes and found that the homozygous mutant type of SNP1 and SNP3 was associated with the highest cashmere yield, the heterozygous mutant type of SNP30 was associated with the lowest cashmere thickness, the wild type of SNP76, SNP77, SNP78, SNP80, and SNP81 was associated with the highest cashmere thickness, and the wild type type of SNP137 was associated with the highest fleece length. 21 haplotype blocks and 68 haplotype combinations were constructed. Haplotypes A2A2, B2B2, C2C2, and D4D4 were associated with increased cashmere yield, haplotypes E2E2, F1F1, G5G5, and G1G5 were associated with decreased cashmere fineness, haplotypes H2H2 was associated with increased cashmere thickness, haplotypes I1I1, I1I2, J1J4, L5L3, N3N2, N3N3, O2O1, P2P2, and Q3Q3 were associated with increased cashmere length. We verified the polymorphism of 8 SNPs by KASP, and found that chr7_g.102631194A > G, chr10_g.82715068 T > C, chr1_g.124483769C > T, chr24_g.12811352C > T, chr6_g.114111249A > G, and chr6_g.115606026 T > C were significantly genotyped in verified populations (P < 0.05). Conclusions In conclusion, the genetic effect of single SNP on phenotypes is small, and SNPs are more inclined to be inherited as a whole. By constructing haplotypes from SNPs that are significantly associated with cashmere traits, it will help to reveal the complex and potential causal variations in cashmere traits of Inner Mongolia cashmere goats. This will be a valuable resource for genomics and breeding of the cashmere goat. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-024-10543-4.


Introduction
Inner Mongolia cashmere goat is an excellent local livestock breed after long-term natural selection and artificial systematic breeding.It was included in the "National List of Livestock and Poultry Breeds Protection" in 2000 and the "National List of Livestock and Poultry Genetic Resources Protection" in 2006.In 2008, the breeding area of the Inner Mongolia Cashmere Goat was designated as a national breeding herd and protected area.Inner Mongolia cashmere goat is the state expressly prohibited the export of livestock species [1].The cashmere produced by Inner Mongolia cashmere goats has the advantages of slimness, softness and good luster, which is loved by domestic and foreign consumers.It is the only export livestock product with pricing rights in China, which plays an important role in economic development.With the current transformation and upgrading of the livestock industry and the improvement of people's consumption demand, the current cashmere production capacity of Inner Mongolia cashmere goats can not meet the current demand.Therefore, to improve the performance of cashmere production has become one of the problems to be solved, and the genetic improvement of cashmere production performance has become the focus of breeding work.
GWAS, as an important tool for genetic localization, has been playing an important role in resolving the genetic basis of economically important traits in livestock and poultry.Cashmere trait is an important indicator of fiber yield and quality during goats domestication.However, the gene regulation behind the selection of cashmere traits is still unclear.Wang et al. [2] used the Illumina Goat 52 K SNP chip to detect 192 Inner Mongolia cashmere goats.Through GWAS, they found that four SNPs were significantly associated with cashmere fiber length, fiber diameter, and yield.These SNPs were located in genes such as FGF12, SEMA3D, EVPL, and SOX5, which may have potential effects on the cashmere traits of Inner Mongolia cashmere goats.At present, although there are a few studies on cashmere traits, it is still necessary to expand the population in different regions, increase the sequencing depth, and use statistical methods to mine the genetic markers that actually influence the phenotype of cashmere goats.In this study, 404 Inner Mongolia cashmere goat individuals were subjected to whole genome resequencing, and the obtained SNPs were quality-controlled and then the linear mixed model of GCTA software was used to analyze the association of cashmere traits respectively, in order to search for the causal variation and potential candidate genes of the cashmere traits, which provided some theoretical foundation and basis for the subsequent breeding work and genetic study of Inner Mongolia cashmere goats.

Source of experimental animal and phenotype data
The experimental goats in this study were all from Erlangshan Ranch of Inner Mongolia Beiping Textile Co., Ltd.(Inner Mongolia cashmere goat Erlangshan type national protected breeding farm), and the breeds of cashmere goats raised were all Erlangshan type Inner Mongolia cashmere goats.We screened 404 Inner Mongolia cashmere goats for resequencing from 9 herds at Erlangshan Ranch.Of these, 84 were rams and 320 were ewes.The ram population consists of 75 one-year-old rams, 4 twoyear-old rams, and 5 three-year-old rams; The ewe population consists of 92 ewes aged one year, 54 ewes aged two years, and 174 ewes aged three years.The number of cashmere goats in herd 1-9 was 9, 75, 41, 49, 14, 29, 46, 92, 49.According to the method described in Table 1, the phenotypic data of cashmere traits of 3842 individuals from 2020 to 2023 were determined, including 1401 males and 2441 females.The genealogy can be traced back to two generations.Regarding phenotypes, there were 6966 records for CY, 6814 records for CD, 5830 records for CT, and 5831 records for FL, respectively.Descriptive statistical analysis of the phenotype and analysis pipeline is presented in Fig. 1.We selected 96 individuals of Inner Mongolia cashmere goats with complete phenotypic records from the 10th herd to extract DNA.
Where y is a vector of phenotypic values; b is a vector of fixed effects (including herd-year of birth and age); a is the vector of additive genetic effects, and its normal distribution is a ~ N(0, Aσ 2 a ), σ 2 a is the additive genetic variance of traits, and A is the additive genetic correlation matrix between individuals.P is a vector of the permanent environmental effect, its normal distribution is p ~ N(0, Iσ 2 P ), σ 2 p is the permanent environmental variance, I is the identity matrix; X, Z1, and Z2 are the correlation matrices of b, a, and p, respectively; e is the residual vector, its normal distribution is e ~ N(0, Iσ 2 e ), I is the identity matrix, σ 2 e is the residual variance.The estimated variance components are calculated using the following formula to calculate the corresponding genetic parameters: Among them, h 2 represents heritability; σ 2 a represents additive genetic variance; σ 2 e represents the residual variance; r e represents repeatability; σ 2 ep represents the permanent environmental variance; σ 2 et represents temporary environmental variance; r A represents genetic correlation between traits.Cov(a 1 , a 2 ) represents the additive effect covariance of traits a 1 and a 2 .σ 2 a 1 represents the additive variance of trait a 1 .σ 2 a 2 represents the additive variance of trait a 2 .r P represents the pheno- typic correlation between traits.Cov(p 1 , p 2 ) represents the phenotypic covariance of p 1 and p 2 .σ 2 p 1 represents the phenotypic variance of the trait p 1 .σ 2 p 2 represents the phenotypic variance of the trait p 2 .

Genomic DNA extraction and quality control
Ear tissue samples were collected from 404 individuals of Inner Mongolia cashmere goats (Erwangshan type).During the sampling, the fleece of Inner Mongolia cashmere goat's ear posterior margin was cut off first, disinfected with 75% ethanol, and 0.5 g of ear tissue near 1/3 of the ear tip was cut off with ear amputer and placed into 1.5 mL cryotube.Because of the small distribution of blood vessels at the back edge of the ear, there will be no bleeding after sampling.After sampling, apply pain medication to the wound immediately, and the animals were released and not sacrificed in the process.All experiments and procedures were carried out following the Scientific Research and Academic Ethics Committee of Inner Mongolia Agricultural University and the Biomedical Research Ethics of Inner Mongolia Agricultural University (Approval No. [2020] 056).No specific permissions were required for these activities, and no endangered or protected species were involved.All samples were stored in liquid nitrogen immediately after collection, and transported to the laboratory for long-term storage at -80℃.DNA was extracted from the ear tissue samples using the phenol-chloroform method, and DNA concentration, the absorption wavelength ratios of the highest absorption peaks of nucleic acids, proteins, carbohydrates (260 nm/230 nm) and phenolics (260 nm/280 nm) were detected by a spectrophotometer (NanoDrop2000).The 1% agarose gel was used to detect and evaluate the DNA quality, Store qualified DNA samples in a -20 ℃ refrigerator for future use.

Library construction and on-line sequencing
After the qualified genomic DNA samples were processed, the genomic DNA was randomly interrupted into 350 bp length fragments by Covaris ultrasonic crusher.
The DNA fragments were subjected to end repair, addition of poly(A), addition of sequencing adapters, purification, PCR amplification and other steps to complete the whole library preparation.After the library was constructed, Qubit 2.0 was used for preliminary quantification, and qPCR was used to accurately quantify the effective concentration of the library to ensure the quality of the library.After the quality of the library passed the test, sequencing was carried out using the DNBSEQ-T7 sequencing platform, and the sequencing mode was PE150 mode.

Identification, screening and annotation of mutation sites
Filter Raw reads data into Clean reads data (-n 10 -q 20 -u 40) using fastp software (V0.20.0)[3], establish a genome index for the reference genome, and the Burrows-Wheeler Aligner (BWA) software (V0.7.17) [4] was used to compare (mem -T 20 -5a) the quality controlled Clean reads with the goat reference genome (ARS1, GCF_001704415.1); using SAMtools software (V1.8-20) [5] to convert (view -b) the compared sam files into bam files and sort (sort) the bam files; MarkDuplicates program in Genome Analysis Toolkit (GATK) software (V3.8) [6] was used to remove duplicates 0" -filter-name "Filter") by using the VariantFiltration module after obtaining the VCF file.Functional annotation of the detected gene variants was performed using the ANNOVAR software package [7].Based on the position of the variant site on the reference genome and the gene position information on the reference genome, the region of the genome in which the variant site occurs (intergenic region, intronic region, or CDS region, etc.), as well as the effect of the variant (synonymous non-synonymous mutation, etc.) can be obtained.

Data quality control, genetic relationship analysis based on IBS distance matrix and G matrix, PCA analysis
The obtained genotyping data were quality controlled using Plink (V1.90) software [8] to exclude individuals with genotype detection rate (call rate) < 98%, SNPs with call rate < 98%, SNPs with minimum allele frequency (MAF) < 5%, and SNPs with P-value of the Hardy Weinberg Equilibrium (HWE) test < 10 -6 .The identity by state (IBS) matrix and the G matrix of genomic relatedness were constructed by using Plink (V1.90) software.Finally, R (V3.6.0)[9] plotting was utilized for the visual presentation of genetic distance and kinship results.The "-pca 3" parameter of Plink (V1.90) software was used to calculate the first three principal components, and the PCA plot was drawn using R (V3.6.0).

Genome-wide association study
The estimated breeding values (EBV) for CY, CD, CT, and FL traits were estimated according to average information restricted maximum likelihood (AI-REML) using the single traits repeatability model and the DMUAI module of DMU software [10], The corrected phenotypic value is obtained by adding the estimated breeding value and the residual, as described below: where y c is the vector of corrected phenotypic value; b is the vector of fixed effects, including herd-year of birth and age; a is the vector of additive genetic effect with a normal distribution is a ~ N(0, Aσ 2 a ), σ 2 a is the additive genetic variance of the trait, and A is the additive genetic y c = Xb + Z 1 a + Z 2 p + e relationship matrix between individuals; p is the vector of permanent environmental effect with a normal distribution is p ~ N(0, Iσ 2 P ), σ 2 p is the variance of permanent environment, I is the unit matrix; X, Z 1 and Z 2 are the incidence matrices of b, a and p, respectively; e is the residual vector with a normal distribution is e ~ N(0, Iσ 2 e ), I is the unit matrix, and σ 2 e is the residual variance.Association analysis between SNPs and individual traits (CY, CD, CT, and FL) was performed using the fastGWA-mlm model in the GCTA (V1.94.0beta) software [11].
Where y is an n × 1 vector of mean-centered phenotypes; X snp is a vector of mean-centered genotype variables of a variant of interest with its effect β snp ; g is a vector of the total genetic effects captured by SNP-derived genetic relationship matrix (GRM) with g ~ N(0, πσ 2 g ); π is a SNP-derived GRM vector of with all of the small offdiagonal elements set to 0; e is a vector of residuals with e ~ N(0, Iσ 2 e ).The threshold of significance for determining GWAS is too stringent due to the use of the Bonferroni correction method.In this study, the threshold for genomewide significant associations was adjusted to P = 1 × 10 -6 .The genome inflation factor (i.e., λ) for the test statistic was calculated from the slope of the linear regression between the observed and theoretical quartiles in R (V4.2.2).Significant SNPs are shown as threshold lines in the Manhattan plot.Manhattan plots and quantile-quantile (QQ) plots were generated by the CMplot package in R (https:// github.com/ YinLi Lin/R-CMplot, 20 January 2019).

Kompetitive allele specific PCR
A novel competitive SNP genotyping method (KASP) was used to verify 8 SNPs that were significantly associated with the cashmere yield, cashmere diameter, cashmere thickness, and fleece length of Inner Mongolia cashmere goats.
All DNA samples diluted to a similar concentration equivalent to 40 ng/sample were utilized for genotyping.Primer5 software was used to design SNP-specific targeted KASP detection primers, and the primers sequence was shown in Table 2.

Statistical analyses
To identify potential candidate genes associated with cashmere traits, significantly associated SNPs were annotated using Bedtools (V2.30.0) software [12] based on the goat ARS1 version of the genome.SNPs were localized to the identified genes, and only those genes that were located within a 1 Mb (± 500 kb) range around each SNP were considered.In the case of more than two identified genes within the identified range, preference was given to the gene where the SNP was located, or the gene where the SNP was closest.However, if the SNP was located between two genes, both genes were selected and the remaining gene was discarded.
In addition, candidate genes were analyzed for GO function annotation and KEGG enrichment using the clusterProfiler package [13] in R (V4.2.2).KEGG pathways and GO biological processes of candidate genes were manually searched to infer potential gene functions.Based on the principles of calculating parameters such as allele frequency, genotype frequency, homozygosity (Ho) and heterozygosity (He), we wrote Excel functions to perform the calculation of population genetics parameters, as well as to test whether the population conforms to the HWE principle.Haplotype analysis was carried out using LDBlockShow (V1.40) software [14] to construct haplotype blocks to detect the region near the significant loci and whether they were in a highly interlocked block, and Haploview software [15] was used to conduct linkage disequilibrium analysis on SNPs of Inner Mongolia cashmere goats whose cashmere traits were significantly associated with HWE.One-way analysis of variance (ANOVA) was performed using SAS (V9.2) software [16], and the t-test was applied when only two genotypes were found in a particular locus, and phenotype-genotype association analysis was performed using SAS (V9.2) software, and the general linear model for the genotypegenotype association analysis was: u being the overall mean, G i representing the genotype effect, and E ij representing the random error.Duncan's multiple comparisons were used to assess the significance of differences between genotypes (P < 0.05) and data were expressed as "mean ± standard error".Associations between the genotypes of 8 SNPs and cashmere traits were evaluated with a one-way analysis of variance (= three genotypes and sample size greater than or equal to 3) and independent sample t-test (= two genotypes) in SPSS Statistics (V25.0).When the analysis of variance performed on each group of genotypes indicated a significant difference (P < 0.05), statistical differences between the two genotypes were subsequently evaluated with the Bonferroni correction test.

Estimation of genetic parameters of cashmere traits
The heritability of cashmere yield, cashmere diameter, and fleece length traits of Inner Mongolia cashmere goats were 0.229, 0.359, and 0.250, which belonged to the medium heritability traits (0.2 ~ 0.4).The cashmere thickness trait has a low heritability of 0.053 (Table 3).
In addition, we calculated the repeatability of each trait.The repeatability of cashmere yield, cashmere diameter, and fleece length traits was higher: 0.372 ± 0.016, 0.555 ± 0.013, and 0.410 ± 0.018, respectively.The repeatability of the cashmere thickness character is low (0.207 ± 0.021).The genetic correlation between the cashmere yield trait and the other three traits was 0.212 ~ 0.835, all of which were positive.The genetic correlation between cashmere yield and cashmere thickness traits was as high as 0.835.The phenotypic correlation between the cashmere yield trait and the other 3 traits ranged from 0.168 to 0.290, all of which were positive.The phenotypic correlation between cashmere yield and cashmere thickness was the highest (0.290).The genetic correlation between cashmere diameter and cashmere thickness traits was positive (0.393), but the genetic correlation with fleece length traits was very low.Cashmere diameter was positively correlated with the phenotypic correlation of cashmere thickness and fleece length.The genetic and phenotypic correlations of cashmere thickness and fleece length traits were 0.475 and 0.252, respectively.

Identification of genomic variations in Inner Mongolia cashmere goats
Whole genome resequencing was performed on 404 individuals of the Inner Mongolia cashmere goats (Erlangshan type), generating raw reads with a total size of 26,835.11Gb.After variant detection and strict quality control, a total of 39,509,854 variants, including SNPs and Indels, were identified in the Inner Mongolia cashmere goat population (Fig. 2a).Then, all detected variants in Inner Mongolia cashmere goats were annotated using gene annotation files downloaded from the Ensembl database, and it was discovered that the highest number of variants were found in intergenic regions (59.09%) and intronic regions (34.34%) (Fig. 2b), of which only 0.79% were located in the coding regions, including 141,732 synonymous mutations and 120,751 non synonymous mutations (Fig. 2c).These potential functional variants provide valuable genetic resources for exploring the genetic structure and functional genes of Inner Mongolia cashmere goats.

Data quality control, genetic relationship analysis and PCA analysis
A total of 34,248,064 SNPs were involved in quality control, 695,497 SNPs with a detection rate of less than 98% (-geno 0.02) were rejected, and the remaining loci were then filtered by HWE filtering, MAF filtering, and individual detection rate filtering (-maf 0.05, -hwe 1e-6,mind 0.02), and a total of 17,135,082 SNPs loci were used for subsequent analysis, which were evenly distributed on 29 pairs of autosomes in goats (Fig. 3a).Genetic distance analysis based on IBS and genomic genetic relationship analysis based on G-matrix could analyze the population without knowing the population lineage or ancestry samples.The results showed that the IBS distance values of the Inner Mongolia cashmere goat population ranged from 0.1298 to 0.2626, with an average genetic distance of 0.2400 ± 0.0111, indicating that the average genetic distance between individuals of Inner Mongolia cashmere goats was farther away and varied greatly.The visualization results of the IBS genetic distance matrix and G-matrix genetic relationship of the whole population are shown in Fig. 3b,  c.Most of the individuals were moderately related to Table 3 The heritability of cashmere traits and the genetic and phenotypic correlations among traits in Inner Mongolia cashmere goats The upper triangle is genetic correlation (standard error), the lower triangle is phenotypic correlation (standard error), and the diagonal is heritability (standard error) each other, while some of the cashmere goat individuals were more closely related to each other.The reason was the presence of both parents and their offspring in the sequenced individuals, which resulted in closer relatedness.We calculated the first three principal components by using Plink (V1.90) software, and the results showed that there was population stratification in the test sample (Fig. 3d).Therefore, we used "breeding value + residual" as the correction phenotype value to effectively correct the population stratification and ensure the accuracy of the experimental results.

GWAS for cashmere traits and functional annotation of candidate genes
Cashmere yield GWAS analysis of the cashmere yield trait based on resequencing data from 404 Inner Mongolia cashmere goats detected 28 significant SNPs (Fig. 4) according to the corrected threshold (P < 1 × 10 -6 ), which were located on chromosomes 1, 6, 7, 9, 10, and 29, respectively (Table 4).Three SNPs on chromosome 7 annotated to the interior of the gene, chr7_g.100291226T > C with chr7_g.100293016A> G located within the gene Among the 28 significantly associated SNPs, there were two SNPs that explained more than 3% of the phenotypic variation.The SNPs that explained the most phenotypic variation were chr9_g.1627149C> T and chr9_g.1627193C> A on chromosome 9, indicating a large contribution to the phenotype, and the phenotypic variation explained by these 2 SNPs was 3.43%, corresponding to the smallest, i.e., the most significant, p-value of 4.71 × 10 -9 , but the corresponding effect value was small, at -122.83.We believe that these 2 SNPs may be important genetic loci that play an important role in the growth and development of hair follicles in Inner Mongolia cashmere goats.The SNPs with the smallest explained phenotypic variation were chr7_g.102631194A> G, with the largest p-value of 8.75 × 10 -7 , corresponding to a larger effect value of -74.90.

Cashmere diameter
Genome-wide association study for the cashmere diameter trait was performed based on resequencing data from 404 Inner Mongolia cashmere goats, and 38 significant SNPs were detected according to the corrected threshold (P < 1 × 10 -6 ) (Fig. 4), which were located on chromosomes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 14, 18, and 21, respectively (Table 4).on chromosome 2 chr2_g.128598723C> A was annotated into the GULP1 gene; 5 SNPs on chromosome     on chromosome 8 was annotated into the GALNTL6 within the gene.Among the 38 significantly associated SNPs, there were 5 SNPs that explained more than 3% of the phenotypic variation.The SNP with the largest explained phenotypic variation was chr7_g.10569445C> T on chromosome 7, which was able to explain 3.97% of the phenotypic variation, corresponding to the smallest, i.e., the most significant, p-value value of 1.08 × 10 -9 , corresponding to the smallest effect value of -0.59 among the 38 SNPs; and the SNP with the smallest explained phenotypic variation was chr10_g.10834113C> G, with the largest p-value of 9.18 × 10 -7 , but corresponding to a larger effect value of 0.26.

Cashmere thickness
Genome-wide association study of cashmere thickness traits was performed based on resequencing data from 404 Inner Mongolia cashmere goats, and 18 significant SNPs were detected according to the corrected threshold (P < 1 × 10 -6 ) (Fig. 4), which were located on chromosomes 2 and 24, respectively (Table 4).all of the significant SNPs on chromosome 2 were annotated to the GTDC1 gene; and all of the significant SNPs on chromosome 24, except chr24_g.12811352C> T, were annotated to the PIK3C3 gene.The significantly associated SNPs on chromosome 2 were all annotated to the GTDC1 gene; the significantly associated SNPs on chromosome 24 were all annotated to the PIK3C3 gene, except for chr24_g.12811352C> T.
Among the 18 significantly associated SNPs, there were 3 SNPs that explained more than 3% of the phenotypic variance.The SNP with the largest explained phenotypic variation was chr2_g.83536939A> C on chromosome 2, which was able to explain 3.87% of the phenotypic variation, corresponding to the smallest, i.e., the most significant, p-value of 2.48 × 10 -9 , but corresponding to a small effect value of -0.44; the SNP with the smallest explained phenotypic variation was chr24_g.14175958A> C, with the largest p-value value of 9.73 × 10 -7 , and also corresponding to the largest effect value of -0.32.

Fleece length
Genome-wide association study of fleece length traits based on resequencing data from 404 Inner Mongolia cashmere goats detected 67 significant SNPs (Fig. 4) according to the corrected threshold (P < 1 × 10 -6 ), which were located on chromosomes 2, 3, 5, 6, 13, 15, 16, 18, 20, 21, 26, 28, and 29, respectively (Table 4).6 SNPs on chromosome 2 were annotated to the RHCE gene; 13 SNPs on chromosome 3 were annotated to the PIGK gene; and the first 6 SNPs on chromosome 6 were annotated to the SORCS2 gene, the last 18 were annotated to the RGS12 gene; the 4 SNPs on chromosome 13 were annotated to the PLCB4 gene; the chr16_g.76396893G> A on chromosome 16 was annotated to within LOC102174324.Among the 67 significantly associated SNPs, there were 3 SNPs that explained more than 3% of the phenotypic variation.The SNPs with the largest explained phenotypic variation were chr3_g.53647318C> T, chr3_ g.53647702 T > C and chr3_g.53649874C> T on chromosome 3.The phenotypic variation that could be explained by these three SNPs was 3.02%, which corresponded to a smaller p-value of 6.58 × 10 -8 , corresponding to a small effect value of -0.85; the SNPs with the smallest explained phenotypic variation were chr6_g.114106682G> T on chromosome 6, with a smaller p-value value of 9.81 × 10 -7 , corresponding to a small effect value of 0.93.

Candidate Gene GO functional annotation and KEGG enrichment analysis
We annotated 392 genes in 151 candidate regions and analyzed the candidate genes for GO functional annotation and KEGG enrichment using clusterProfiler package.The results obtained are shown in Table S2, S3, and these genes were enriched in 1674 GO and 209 KEGG pathways.The GO functional annotation results showed that these candidate genes were involved in the Biological Process of G2/M transition of mitotic cell cycle, Ras protein signal transduction, cell morphogenesis, epithelial cell development and hair cycle process, etc.; in terms of cellular component, they are involved in supramolecular fiber, anchoring junction, cell-cell junction, supramolecular polymer, nucleolus, etc.; in terms of molecular function, it is involved in proton transmembrane transporter activity, calcium ion binding, monoatomic ion transmembrane transporter activity, mRNA binding, ATP hydrolysis activity.For KEGG analysis, most of the important pathways are related to Environmental Information Processing, Cellular Processes and Metabolism, including Notch, P13K-Akt, TGF-beta, Cell cycle, Wnt, MAPK, Hippo, Ras signaling pathway, etc. (Fig. 5a, b).

Population genetic parameters
Population genetic parameters of SNPs and potential candidate genes significantly associated with cashmere traits in Inner Mongoliacashmere goats are shown in Table 5.All 151 SNPs were quality controlled, and those with genotypic deletions were excluded, leaving 136 SNPs for subsequent analysis.Among them, 3 genotypes were detected for all 121 SNPs, of which 24 SNPs were significantly associated with cashmere yield, 27 SNPs were significantly associated with cashmere diameter, 17 SNPs were significantly associated with cashmere thickness, and 53 SNPs were significantly associated with fleece length; and 2 genotypes were detected for the other 15 SNPs, of which 4 SNPs were significantly associated with cashmere yield, 5 SNPs were significantly associated with cashmere diameter was significantly associated, and 6 SNPs were significantly associated with fleece length.The dominant genotypes of 92 SNPs were wild-type, and the dominant genotypes of 44 SNPs were heterozygous.The heterozygosity in this study is observed heterozygosity.It refers to the proportion of individuals in the    population who are heterozygous at a certain site to the total number of individuals [17].The lowest heterozygosity of 0.099 was found for chr1_g.133340208T > G and chr21_g.49120096C> A, and the highest heterozygosity of 0.500 was found for chr1_g.124483769C> T. The effective allele numbers of the Inner Mongolia cashmere goat population ranged from 1.109 to 1.998.Polymorphic information content is related to the number and frequency of allele genes in a population, and is used to represent the degree of polymorphism at a certain site in a population to evaluate the genetic diversity of a population [18].The polymorphic information content of 57 SNPs were all low polymorphism (0 < PIC < 0.25), and the polymorphic information content of 79 SNPs were all moderate polymorphism (0.25 < PIC < 0.50).It indicates that the genetic diversity of Inner Mongolia cashmere goat population is relatively high.The chi-square test found that 26 SNPs significantly deviated from the HWE (P < 0.05) and were not available for subsequent analysis; 110 SNPs conformed to the HWE (P > 0.05) and were available for subsequent analysis.

Association analysis of SNPs polymorphisms with phenotypic values of cashmere traits
The 110 SNPs mentioned above that were conform for HWE were analyzed for association with cashmere traits, and the results of the association analysis of SNPs significantly associated with cashmere traits and potential candidate genes with the phenotypic values of cashmere traits in Inner Mongolia cashmere goats are shown in Fig. 6 and Table S4.17 SNPs (SNP4, SNP5, SNP6, SNP7, SNP9, SNP10, SNP13, SNP14, SNP15, SNP16, SNP19, SNP20, SNP21, SNP22, SNP23, SNP24, SNP25) wild-type were significantly higher than those of the homozygous mutant (P < 0.05); SNP8, SNP17 and SNP18 wild type were significantly higher than those of the heterozygous mutant and the homozygous mutant (P < 0.05); SNP11, SNP12, SNP27 and SNP28 wild types were significantly higher than heterozygous mutants (P < 0.05); the cashmere yield of SNP1 and SNP3 homozygous mutants was significantly higher than that of the wild type and the heterozygous mutants (P < 0.05); moreover, the homozygous mutant types of these 2 SNPs were associated with the highest cashmere yield; and the cashmere yield of SNP26 wild type was significantly higher than the heterozygous mutant (P < 0.05), and the heterozygous mutant had significantly higher cashmere yield than the homozygous mutant (P < 0.05).SNP29 and SNP38 homozygous mutants had significantly better cashmere diameter than heterozygous mutants (P < 0.05) and heterozygous mutants were significantly better than wild type (P < 0.05); SNP30 and SNP66 heterozygous mutants had significantly better cashmere diameter than wild type individuals (P < 0.05) and heterozygous mutants of SNP30 were associated with the lowest cashmere diameter; SNP34, SNP35, SNP36, SNP37, SNP61, and SNP65 wild type were significantly better than heterozygous mutants (P < 0.05); SNP39 homozygous and heterozygous mutants were significantly better cashmere diameter than wild type individuals (P < 0.05), while there was no significant difference between homozygous and heterozygous mutant individuals (P > 0.05); The cashmere diameter of SNP41 and SNP62 wild type was significantly better than heterozygous mutant (P < 0.05), and heterozygous mutant was significantly better than homozygous mutant (P < 0.05); the cashmere diameter of wild type and heterozygous mutant at SNP56 and SNP64 was significantly better than that of homozygous mutant (P < 0.05), and the cashmere diameter of wild type was better than heterozygous mutant but the difference was not significant (P > 0.05).
The cashmere thickness of the wild type of 13 SNPs (SNP67, SNP68, SNP73, SNP74, SNP75, SNP76, SNP77, SNP78, SNP80, SNP81, SNP82, SNP83, SNP84) was significantly higher than that of the heterozygous mutant and the homozygous mutant (P < 0.05) and there was no significant difference between the heterozygous mutant and the homozygous mutant individuals were not significantly different from each other (P > 0.05), the wild types of SNP76, SNP77, SNP78, SNP80, and SNP81 were associated with the highest cashmere thickness; the cashmere thickness of SNP79 wild type individuals was significantly higher than that of the homozygous mutant type (P < 0.05), and the differences among the genotypes of SNP69, SNP70, and SNP72 were insignificant (P > 0.05).
Seventeen SNPs (SNP85, SNP87, SNP88, SNP89, SNP116, SNP119, SNP120, SNP121, SNP122, SNP123, SNP124, SNP125, SNP126, SNP130, SNP131, SNP132, and SNP133) wild type and heterozygous mutant had significantly higher cashmere length than the homozygous mutant (P < 0.05), and the wild type was higher than the heterozygous mutant but there was no significant difference (P > 0.05); 11 SNPs (SNP91, SNP92, SNP94, SNP95, SNP96, SNP97, SNP106, SNP134, SNP135, SNP136, SNP137) wild type had significantly higher fleece length than heterozygous mutants (P < 0.05) and heterozygous mutants had significantly higher fleece length than homozygous mutants (P < 0.05), and the wild type of SNP137 was associated with the highest fleece length; 9 SNPs (SNP107, SNP108, SNP109, SNP110, SNP111, SNP112 SNP113, SNP114, SNP115) had significantly higher fleece lengths in the homozygous mutant than in the heterozygous mutant (P < 0.05), while the heterozygous mutant had significantly higher fleece lengths than the wild type (P < 0.05); SNP104, SNP105, SNP138, and SNP144 heterozygous mutants had significantly higher fleece lengths than the wild type (P < 0.05); SNP139 and SNP145 homozygous and heterozygous mutants had significantly higher fleece length than wild type (P < 0.05), and the fleece length of homozygous mutants was higher than that of heterozygous mutants but the difference between the two was not significant (P > 0.05); the fleece length of wild type individuals of SNP142, SNP148 and SNP149 was significantly higher than that of heterozygous mutants (P < 0.05); SNP146 and SNP147 wild type fleece length was significantly higher than heterozygous mutant and homozygous mutant (P < 0.05), and heterozygous mutant was higher than homozygous mutant fleece length but the difference between them was not significant (P > 0.05).

Haplotype block analysis
Since most of the significant SNPs on chromosomes 2, 3, 6, 7, 9, 13, 16, and 29 are in close proximity to each other and may have strong linkage disequilibrium, the 151 SNPs that were significantly associated with the cashmere traits were analyzed using the LDBlockShow (v1.40) software in accordance with the D' > 0.33, r 2 > 0.1 [19] to construct haplotype blocks on a genome wide scale.The SNPs significantly associated with cashmere yield formed 4 blocks with block, with block sizes ranging from 0.04 kb to 5.88 kb (Table S5, Fig. S1); The SNPs significantly associated with cashmere diameter formed 4 blocks with block sizes ranging from 3.20 kb-12.05kb (Table S5, Fig. S2); The SNPs significantly associated with cashmere thickness formed a block with a block size of 1.86 kb (Table S5, Fig. S3); The SNPs significantly associated with fleece length formed 12 blocks, with block sizes ranging from 0.01 kb to 26.54 kb (Table S5, Fig. S4).

Linkage disequilibrium analysis
Based on the above screening results, Haploview software was used to analyze the linkage disequilibrium of SNPs that were significantly associated with cashmere traits in Inner Mongolia cashmere goats and conformed HWE.The 15 SNPs significantly associated with cashmere yield traits in Inner Mongolia cashmere goats constituted 4 Blocks (Fig. 7), and there was a complete linkage (D' = 1) between chr7_g.100291226T > C and chr7_g.100293016A> G of (See figure on next page.)Fig. 7 Results of linkage disequilibrium analysis and haplotype block of significantly associated SNPs of cashmere yield in IMCGs.a D' value, r 2 value and haplotype frequency of block formed by 2 SNPs in the LOC102170865 gene.b D' value, r 2 value and haplotype frequency of block formed by 2 SNPs in the LOC108636448 gene.c D' value, r 2 value and haplotype frequency of block formed by 9 SNPs in the COL12A1 gene.d D' value, r 2 value and haplotype frequency of block formed by 2 SNPs in the LOC102178345 gene.The linkage between loci was judged by D' and r 2 values, where D' = 1 or r 2 = 1 is called full linkage, D' = 0 or r 2 = 0 is no linkage or linkage equilibrium, D' > 0.80 and r 2 > 0.33 suggests strong linkage.The following is the same as in the previous sentence LOC102170865 gene; complete linkage (D' = 1) between chr7_g.100300867G> A and chr7_g.100301174G> A of LOC108636448 gene.There was a strong linkage disequilibrium (D' > 0.8 and r 2 > 0.33) between chr9_g.1622595C> G and chr9_g.1624552T > G, chr9_g.1627029G > A, chr9_g.1627149C> T, chr9_g.1627193C > A, chr9_g.1627936T > C, chr9_ g.1628474C > A of COL12A1 gene, and complete linkage (D' = 1) between other SNPs.chr29_ g.38178615G > A and chr29_ g.38178652G > A have strong linkage disequilibrium (D' > 0.8 and r 2 > 0.33) between them of LOC102178345 gene.

Association analysis of haplotype combinations with cashmere traits
The association analysis of haplotype combinations associated with cashmere traits and cashmere trait phenotypes in Inner Mongolia cashmere goats is shown in Fig. 11 and Table S6.Haplotype combinations with the number of individuals less than 3 were not involved in the multiple comparisons, and among the haplotype combinations constructed by the SNPs that were significantly associated with the cashmere traits, we obtained 3, 3, 5, and 2 haplotype combinations associated with the cashmere yield trait, 2, 2, and 6 haplotype combinations associated with cashmere diameter trait, 3 haplotype combinations associated with cashmere thickness traits, and 3, 5, 4, 5, 6, 5, 7, 4, 5, and 4 haplotype combinations associated with fleece length trait, respectively.
The cashmere yield of the A2A2 haplotype combination of the LOC102170865 gene was significantly higher than that of the A1A1 haplotype combination (P < 0.05), the B2B2 haplotype combination of the LOC108636448 gene was significantly higher than that of the B1B1 haplotype combination (P < 0.05), the C2C2 haplotype combination of the COL12A1 gene was significantly higher than that of the C2C3 and C4C4 haplotype combination (P < 0.05), and the D4D4 haplotype combination of the LOC102178345 gene was significantly higher than that of the D4D1 haplotype combination (P < 0.05).The cashmere diameter of the E2E2 haplotype combination of PDLIM5 gene is significantly better than that of the E2E1 haplotype combination (P < 0.05), the cashmere diameter of the F1F1 haplotype combination of PDLIM5 gene is significantly better than that of the F1F2 haplotype combination (P < 0.05), and the cashmere diameter of the G5G5 and G1G5 haplotype combination is significantly better than that of the G2G2 and G1G2 haplotype combination (P < 0.05).The cashmere thickness of the H2H2 haplotype combination of the GTDC1 gene was significantly higher than that of the H2H1 and H1H1 haplotype combinations (P < 0.05).The fleece length of the I1I1 and I1I2 haplotype combinations of the RHCE gene was significantly higher than that of the I2I2 combination (P < 0.05), the hair length of J1J4 of the PIGK gene was significantly higher than that of J5J5 combination (P < 0.05), the fleece length of the L5L3 haplotype combination of the SORCS2 gene was significantly higher than that of L2L3, L2L2, and L2L5 combination (P < 0.05), the N3N2 and N3N3 haplotype combinations of the RGS12 gene were significantly higher than those of N2N1 and N1N1 combination (P < 0.05), and the O2O1 haplotype combination of the RGS12 gene was significantly higher than that of O2O3 The haplotype combination of O3O3 and O4O3 (P < 0.05), the P2P2 haplotype combination of RGS12 gene was significantly higher than the P3P3 haplotype combination (P < 0.05), and the Q3Q3 haplotype combination of PLCB4 gene was significantly higher than the Q3Q2 and Q2Q4 combinations (P < 0.05).There was

KASP validated SNPs polymorphism
The KASP genotyping results of 8 SNPs in Inner Mongolia cashmere goats were shown in Fig. 12 and Table 7, and all SNPs were successfully genotyped in 96 individuals.Correlation analysis between SNP and cashmere traits showed that the cashmere yield of AA genotype with chr7_g.102631194A> G was significantly increased by 140.75 g compared with AG genotype (P = 0.0017).The cashmere yield of the CC genotype with chr10_g.82715068T > C was significantly increased by 160.77 g compared with the TT genotype (P = 0.0165).The cashmere diameter of the TT genotype with chr1_g.124483769C> T was significantly decreased by 2.085 μm compared with the CC genotype (P = 0.0306).The cashmere thickness of CC genotype with chr24_g.12811352C> T was significantly increased by 0.881 cm compared with CT genotype (P = 0.0389).The fleece length of the GG genotype with chr6_g.114111249A> G was significantly increased by 2.261 cm compared with the AA genotype (P = 0.0220).The fleece length of the CC genotype with chr6_g.115606026T > C was significantly increased by 1.597 cm compared with the TT genotype (P = 0.0161).

Discussion
In this study, the estimated heritability values of cashmere yield, cashmere diameter, and fleece length traits of Inner Mongolia cashmere goats were 0.229, 0.359, The estimated heritability of cashmere thickness is 0.053, which is a low heritability trait.In other studies, the heritability of Inner Mongolia cashmere goat cashmere yield was 0.24, 0.30, and 0.34, and the heritability of cashmere diameter was 0.27, 0.28, and 0.32, respectively.The heritability of fleece length was 0.29, 0.31, and 0.32, respectively, all belonging to the medium heritability, which was consistent with the results of this study.For cashmere thickness traits, the results of this study are lower than those of Fenghong Wang, Xuewu Li, and Junyan Bai (0.14, 0.17, and 0.21) [20][21][22], which may be due to the difference in the number of effective data pieces and estimation methods.Although there are some differences between this study and some other studies, most studies show that cashmere thickness is a low heritability trait.
This study showed that cashmere yield was positively correlated with cashmere diameter, cashmere thickness, and fleece length, indicating that the higher the cashmere yield, the coarser the cashmere diameter, the greater the cashmere thickness, the longer the fleece length.We found that the genetic correlation between cashmere yield and cashmere thickness was as high as 0.835, indicating that the cashmere yield of Inner Mongolia cashmere goats was largely determined by cashmere thickness.In other words, as cashmere thickness increases, cashmere yield increases significantly.The genetic correlation between cashmere yield and cashmere diameter is 0.368, indicating that cashmere grows thicker with the increase in cashmere yield, which is contrary to our breeding goal of "Inner Mongolia cashmere goats producing high yield and good quality cashmere."In the process of breeding, it is difficult to achieve both cashmere yield and cashmere quality, so in practice, we should establish different core groups according to the needs, such as fine cashmere goats and high-yield cashmere goats.The genetic correlation between cashmere thickness and fleece length was 0.475, indicating that cashmere thickness traits could be indirectly increased and improved by fleece length breeding.Cashmere thickness is closely related to cashmere yield.Therefore, cashmere yield can be increased by breeding long fleece cashmere goats, which will lay the foundation for the development of new Inner Mongolia cashmere goats with high yield and high quality cashmere.
Here, we conducted a GWAS work based on Inner Mongolia cashmere goat population.In the GWAS analysis, SNP effect value and significance P-value are two different concepts, and it is not the case that the larger the effect value, the more significant it is.In fact, the SNP with small effect values but significant belong to that the SNP have a direct effect on the phenotypic value.However, the effect of the influence is smaller and more stable, which is why they show a highly significant and small effect.For example, chr9_g.1627149C> T and chr9_g.1627193C> A on chromosome 9, which affect the trait of cashmere yield, are significant but have small effect values. the SNP with large effect values but insignificant may be that the SNPs are easily affected by the environment, or the SNPs themselves may be insignificant.Among these 151 SNP, a considerable number of SNP explained more than 2% of the phenotypic variation, such as in the results of cashmere diameter trait, the SNP that explained the largest phenotypic variation was chr7_g.10569445C> T, and the phenotypic variation that this SNP could explain was 3.97%.Although these SNPs may not be QTNs (quantitative trait nucleotides) directly affecting the target traits, their high rate of explained phenotypic variation suggests that these SNPs can be used in marker-assisted selection as well as genomewide selection to improve selection for cashmere traits in Inner Mongolia cashmere goats.
In this study, we performed GWAS and identified a large number of candidate genes associated with hair   [23].According to clus-terProfiler enrichment analysis, Notch3 is involved in the Notch signaling pathway for intercellular signaling, which is essential for normal embryonic development in mammals.Significant SNPs associated with cashmere yield: chr6_g.3398555C> T, the annotated CCNA2 gene belongs to a highly conserved family of cell cycle proteins that binds to and activates CDC2 or CDK2 kinases, thereby promoting the cell cycle G1/S and G2/M transitions [24].In addition, it has been shown that 2 SNPs on CCNA2 have potential effects on hair density in otter rabbits [25].CCNA2 can affect the cell cycle of hair papilla cells and hair follicle stem cells [26,27].Significant SNPs associated with cashmere diameter: chr6_g.30455840T > C annotated to BMPR1B showed a trend of cycle-specific differential expression in porcine embryonic skin tissues [28], also enriched for the TGF-beta signaling pathway that regulates cellular function.Hair follicle stem cells are key to promoting hair follicle growth and homeostasis, and have a significant hair cycling regenerative capacity, and exhibit fate plasticity during skin trauma healing.It has been shown that LGR5 is a marker of homeostasis and development of hair follicle stem cells across species [29].Significant SNP associated with fleece length: chr26_g.7282942G> T annotated to CTBP2 is stably expressed in goat skin tissues [30].chr21_g.65871544A> G annotated PPP2R5C is essential in mammalian growth and development, and it has been shown that PPP2R5C g.65977743C > T is significantly associated with the number of third lactation litters of the Yunshang black goats [31].chr2_g.8298760A> T annotated STMN1 promotes the proliferative process of hair follicle cells in mice [32].STMN1 was also enriched in the MAPK signaling pathway involved in cell proliferation, differentiation and migration, implying that the gene could regulate cell cycle process.When screening candidate genes affecting the four traits, we found that SNPs significantly associated with cashmere yield and fleece length were co-annotated to the HMX1, ADRA2C, AFAP1 and ABLIM2 genes, suggesting that these four genes can regulate both cashmere yield and fleece length, and that by focusing on these genes in the selection and breeding process, we can select goat individuals with high cashmere yield and longer fleece.Moreover, as fleece length increases, so do cashmere yield and body weight [33].Selection of long fleece individuals to be retained in the breed can accelerate the genetic progression of cashmere yield and body weight in the cashmere goat, and realize the indirect selection of fleece for cashmere yield and body weight.
Haplotype information is widely used in linkage analysis, correlation studies, population genetics, etc. [34].Haplotype analysis provides more comprehensive information, and more accurate statistical results can be obtained using haplotypes than single marker analysis [35].In the genetic process, SNPs tend to be inherited as a whole to the offspring, if the density of significant SNPs obtained through GWAS is low, the range of localization intervals will increase at this time, which will cause difficulties in the subsequent search for candidate genes, and the construction of haplotypes is undoubtedly the best solution.Through statistical methods, a class of SNPs with correlations on chromosomes constitutes a collection, which is used to reflect the genetic effects of these SNPs.The advantage of the single-SNP GWAS approach is that errors in mapping order have no effect on the SNPs, but may lead to erroneous inference of haplotype alleles.In addition, the haplotype GWAS method is based on a chi-square distribution, which has more degrees of freedom than a single SNP, so for the same p-value, the likelihood ratio needs to be higher to reach the significance threshold [36].However, haplotype GWAS seems to be more advantageous for detecting complex signals because of the diversity of combinations of alleles in haplotypes.For example, there are more than two alleles in a causal mutation associated with a trait or two possible mutations in the same gene.Martin et al. [37] in exploring novel loci associated with coat color traits in Saanen goats found that both the single SNP GWAS and haplotype GWAS methods detected the highest signals for the coat color trait, whereas the smaller signals detected by the two methods did not coincide, and such a results do not exclude the possibility of false positives, and in the process of performing haplotype GWAS, factors such as family structure are also likely to have an impact on the accuracy of haplotype construction.Zhu et al. [38] performed GWAS analysis of six erythrocyte traits in 498 Alpine Merino sheep based on the single SNP and haplotype methods, and the significant candidate genes obtained by the single SNP GWAS and haplotype GWAS methods were taken to intersections and screened three significant genes, DHCR24, PLCB1 and SPATA9.In addition, FLI1, an important gene related to hematopoiesis, was also obtained by haplotype GWAS method.This suggests that we can make the results more reliable by utilizing overlapping markers detected by two different methods, while non-overlapping markers may represent new valuable associations detected by different methods.In the present study, haplotypes A2A2, B2B2, C2C2, and D4D4 were associated with elevated cashmere yield, haplotypes E2E2, F1F1, G5G5, and G1G5 were associated with reduced cashmere fineness, haplotype H2H2 was associated with elevated cashmere thickness, and haplotypes I1I1, I1I2, J1J4, L5L3, N3N2, and N3N3, O2O1, P2P2, and Q3Q3 were associated with increased cashmere length.Multiple alleles in the haplotype are co-inherited on the same chromosome and collectively influence cashmere traits.A series of SNPs in haplotypes that affect the cashmere traits of Inner Mongolia cashmere goats can lay a theoretical foundation for molecular markerassisted selection of cashmere traits of Inner Mongolia cashmere goats.
We verified that 8 newly discovered SNPs, chr7_g.102631194A> G, chr10_g.82715068T > C, chr1_g.124483769C > T, chr24_g.12811352C> T, chr6_g.114111249A > G, and chr6_g.115606026T > C were significantly genotyped in verified populations (P < 0.05).The correlation analysis results of chr6_g.30463541A> T and chr24_g.14180758C> T with cashmere traits were not significant, possibly because the P value after Bonferroni correction could not reach the significance level due to the stricter standard [39].It is also possible that the sample size of the verification group is small, sample sizes can provide imprecise or incorrect estimates of the magnitude of the observed effects [40].However, the limitation of this study is that only some Inner Mongolia cashmere goats were investigated.Therefore, it is unclear whether these results can be determined and generalized to other goat breeds.In future studies, goats from different pastures, more goat samples and goat breeds will be investigated to further confirm the results of this study and to discover more valuable new variants.

Conclusions
Using the sum of estimated breeding values and residuals as corrected phenotypic values, we detected 151 genomewide SNPs significantly associated with four cashmere traits in the Inner Mongolia cashmere goat population, and constructed 21 haplotype blocks and 68 haplotype combinations.Among them, chr7_g.102631194A> G, chr10_g.82715068T > C, chr1_g.124483769C> T, chr24_ g.12811352C > T, chr6_g.114111249A> G, and chr6_ g.115606026 T > C were significantly genotyped in verified populations (P < 0.05).These SNPs can be used as candidate sites for molecular marker-assisted selection of cashmere traits in Inner Mongolia cashmere goats.In the subsequent research, it is necessary to further expand the population size and sequencing depth, and actively carry out functional verification tests, so that these new candidate genes or haplotypes can be more accurately applied to the genetic improvement of Inner Mongolia cashmere goat populations.

Fig. 2
Fig. 2 Variants characteristics of Inner Mongolia cashmere goat population.a Genome-wide distribution of detected variants on 29 chromosomes for Inner Mongolia cashmere goats.X-axis represents 29 autosomes; Y-axis represents the number of variants.b Genome-wide annotation of Inner Mongolia cashmere goats genetic variations.X-axis represents various functional regions; Y-axis represents the number of genetic variations within various functional regions.c Statistics of variants functional annotation results in CDS region.X-axis represents various functions; Y-axis represents the number of genetic variations within various functions

Fig. 3
Fig. 3 Visualization of SNPs on chromosomes, visualization of relatedness, and visualization of PCA. a Distribution of SNPs on chromosome 1 Mb window after quality control.The left Y-axis indicates the chromosome name and the upper X-axis indicates the window size.b Visualization of the IBS genetic distance matrix.Where each small square represents the value of genetic distance between the first to the last sample two by two, the larger the value the closer to purple, i.e., the larger the genetic distance between two individuals, the more distant the kinship, and vice versa.c G-matrix visualization.Each small square represents the kinship value between the first and last samples, the larger the value the closer to orange, i.e., the closer the two individuals are related and vice versa.d Visualization of PCA.The first three explain the variance percentage (PC1, PC2, and PC3) as the X, Y, and Z axes

Fig. 4
Fig. 4 Manhattan Plots and QQ-plots showing GWAS results for four cashmere traits in Inner Mongolia cashmere goats.Genome-wide significant SNPs are shown in red.CY = cashmere yield, CD = cashmere diameter, CT = cashmere thickness, FL = fleece length

Fig. 5
Fig. 5 Bar chart of GO and KEGG enrichment analysis of the candidate genes.a represents the GO enrichment results, in the legend, BP represents biological process, CC represents cellular component, MF represents molecular function; b represents the KEGG enrichment results

Fig. 6
Fig. 6 Assocaition analysis of significantly associated SNPs with phenotypes of cashmere traits in IMCGs.110 SNPs conforming to the HWE were analyzed."*" indicates a significant difference, and "ns" indicates insignificant difference

Fig. 9
Fig. 9 Results of linkage disequilibrium analysis and haplotype block of significantly associated SNPs of cashmere thickness in IMCGs.a D' value, r 2 value and haplotype frequency of block formed by 2 SNPs in the GTDC1 gene

Fig. 10
Fig. 10 Results of linkage disequilibrium analysis and haplotype block of significantly associated SNPs of fleece length in IMCGs.a D' value, r 2 value and haplotype frequency of block formed by 3 SNPs in the RHCE gene.b D' value, r 2 value and haplotype frequency of block formed by 5 SNPs in the PIGK gene.c D' value, r 2 value and haplotype frequency of block formed by 3 SNPs in the SORCS2 gene.d D' value, r 2 value and haplotype frequency of block formed by other 3 SNPs in the SORCS2 gene.e D' value, r 2 value and haplotype frequency of block formed by 3 SNPs in the RGS12 gene.f D' value, r 2 value and haplotype frequency of block formed by 2 SNPs in the RGS12 gene.g D' value, r 2 value and haplotype frequency of block formed by 2 SNPs in the RGS12 gene.h D' value, r 2 value and haplotype frequency of block formed by 3 SNPs in the RGS12 gene.i D' value, r 2 value and haplotype frequency of block formed by 4 SNPs in the PLCB4 gene.j D' value, r 2 value and haplotype frequency of block formed by 2 SNPs

Fig. 11
Fig. 11 Association analysis between haplotype combinations and phenotypes of cashmere traits in IMCGs.Different letters indicate significant difference, and the same letters indicate insignificant difference

Table 2
Selected SNPs used in the study for verification of the Inner Mongolia cashmere goats

Table 4
Description of significantly associated SNPs and potential candidate genes of cashmere traits in IMCGs

Table 4
(continued) r 2 (%) represents the phenotypic variance explained by the SNP.Positive numbers in the distance column indicate the distance between the SNP and the gene upstream; negative numbers indicate the distance between the SNP and the gene downstream.In the table, only genes whose SNP are located within the gene or those closest to the SNP are shown as potential candidate genes.The full results of gene annotation are shown in TableS1

Table 5
Population genetic parameters of significantly associated SNPs and potential candidate genes of cashmere traits in IMCGs

Table 5
(continued) a NSNP single nucleotide polymorphism b HWE Hardy-Weinberg Equilibrium test c Ho Homozygosity d He Heterozygosity e PIC Polymorphism information content f Ne Effective allele numbers

Table 6
Haplotype combinations and frequencies associated of cashmere traits in IMCGs

Table 7
Genotyping of 8 SNPs in Inner Mongolia cashmere verified populations

Table 8
Association analysis of verified SNPs with cashmere traits